Reduction of Highly Nonstationary Ambient Noise by Integrating Spectral and Locational Characteristics of Speech and Noise for Robust ASR
Abstract
This paper proposes a new multi-channel noise reduction approach that can appropriately handle highly nonstationary noise based on the spectral and locational features of speech and noise. We focus on a distant-talking scenario, where a two-channel microphone array receives a target speaker’s voice from the front while it receives highly nonstationary ambient noise from any direction. To cope well with this scenario, we introduce prior training not only for the spectral features of speech and noise but also for their locational features, and utilize them in a unified manner. The proposed method can distinguish rapid changes in speech and noise based mainly on their locational features, while it can reliably estimate the spectral shapes of the speech based largely on the spectral features. A filter-bank based implementation is also discussed to enable the proposed method to work in real time. Experiments using the PASCAL CHiME separation and recognition challenge task show the superiority of the proposed method in terms of both speech quality and automatic speech recognition performance.
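As a rough illustration of what a locational feature can look like for a two-microphone array, the sketch below computes the inter-channel phase difference (IPD) per time-frequency bin and keeps the bins whose IPD is close to zero, which is what a source directly in front of the array would produce. This is not the method proposed in the paper, which combines prior-trained models of spectral and locational features; the function names `stft` and `frontal_mask`, the frame settings, and the threshold `max_phase_diff` are all assumptions made for illustration.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Short-time Fourier transform with a Hann window; returns (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def frontal_mask(left, right, max_phase_diff=0.5):
    """Binary time-frequency mask (hypothetical helper, not the paper's model).

    A bin is kept (1.0) when the phase difference between the two channels
    is smaller than max_phase_diff radians, i.e. when the sound in that bin
    appears to arrive at both microphones at the same time.
    """
    spec_left = stft(left)
    spec_right = stft(right)
    ipd = np.angle(spec_left * np.conj(spec_right))  # phase difference per bin
    return (np.abs(ipd) < max_phase_diff).astype(float)
```

A hard threshold like this only captures the locational cue; estimating the spectral shape of the speech, as the abstract describes, would need a separate spectral model.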
Similar resources
Improving the performance of MFCC for Persian robust speech recognition
The Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Spectral Noise Tracking for Improved Nonstationary Noise Robust ASR
One approach to nonstationary-noise-robust automatic speech recognition (ASR) is first to estimate the changing noise statistics and then to clean up the features accordingly prior to recognition. Here, the first step is accomplished by noise tracking in the spectral domain, while the second relies on Bayesian enhancement in the feature domain. In this way we take advantage of our recently proposed maxim...
Two-Microphone Noise Reduction Using Spatial Information-Based Spectral Amplitude Estimation
Traditional two-microphone noise reduction algorithms for dealing with highly nonstationary directional noises generally use direction-of-arrival or phase-difference information. The performance of these algorithms deteriorates when diffuse noises coexist with nonstationary directional noises in realistic adverse environments. In this paper, we present a two-channel noise reduction algorithm usi...
How Realistic is Artificially Added Noise?
Evaluations of algorithms for robust automatic speech recognition (ASR) are often based on artificial noisy speech instead of realistic noisy speech. In this paper we compare the ASR performance of speech with artificial additive noise to the performance of realistic noisy speech. All data was recorded during the same recording campaign and with nearly identical channel characteristics. The sim...
Automatic Speech Recognition of Human-Symbiotic Robot EMIEW
Automatic Speech Recognition (ASR) is an essential function of robots that live in the human world. ASR has been studied for a long time, and as a result computers can recognize human speech well in quiet environments. However, the accuracy of ASR is greatly degraded in noisy environments. Therefore, noise reduction techniques for ASR are strongly desired. Many approaches based on...
Publication date: 2011